Augmented Machine Decision Making

ABSTRACT

Sensor data is received. The sensor data is classified into one of two or more classes by at least requesting processing of a machine computational component, receiving a result of the machine computation component, requesting processing of an agent computation component, and receiving a result of the agent computation component. The agent computation component includes a platform to query an agent. The result from the agent computation component or the result from the machine computation component is provided. Related apparatus, systems, techniques, and articles are also described.

RELATED APPLICATION

This application claims priority under 35 U.S.C. §119(e) to U.S. provisional application No. 62/237,733 filed Oct. 6, 2015, the entire contents of which are hereby expressly incorporated by reference herein.

TECHNICAL FIELD

The subject matter described herein relates to improving machine decision making.

BACKGROUND

In artificial intelligence (AI), difficult problems are informally known as AI-complete or AI-hard, implying that the difficulty of these computational problems is equivalent to that of solving the central artificial intelligence problem, which is making computers as intelligent as people, also referred to as strong AI. An AI-complete problem is one not solved by a simple specific algorithm. AI-complete problems include computer vision, natural language understanding, dealing with unexpected circumstances while solving any real world problem, and the like. Currently, AI-complete problems cannot be solved with modern computer technology alone.

Current AI systems can solve very simple restricted versions of AI-complete problems, but never in their full generality. When AI researchers attempt to “scale up” their systems to handle more complicated, real world situations, the programs tend to become excessively brittle without commonsense knowledge or a rudimentary understanding of the situation. In other words, they fail as unexpected circumstances outside of their original problem context begin to appear. When human beings are dealing with new situations in the world, they know what to expect: they know what all things around them are, why they are there, what they are likely to do and so on. Humans can use context and experience to guide them in recognizing unusual situations and adjusting accordingly. A machine without strong AI has no other skills to fall back on, so some machine decision-making applications are intractable.

SUMMARY

In an aspect, sensor data is received. The sensor data is classified into one of two or more classes by at least requesting processing of a machine computational component, receiving a result of the machine computation component, requesting processing of an agent computation component, and receiving a result of the agent computation component. The agent computation component includes a platform to query an agent. The result from the agent computation component or the result from the machine computation component is provided.

In another aspect, sensor data of a security system asset is received. A predefined modality associated with the security system asset is accessed. The modality defines a computational task for analyzing the received sensor data. A solution state machine object having a plurality of states and rules for transitioning between the plurality of states is instantiated. The plurality of states includes an initial state, a first intermediate state, a second intermediate state, and a terminal state. The task is executed using the solution state machine object. The executing includes requesting processing of the task by, and receiving a result of, a machine computation component when a current state of the solution state machine object is the first intermediate state. The result received from the machine computation component includes a first confidence measure. The executing includes requesting processing of the task by, and receiving a result of, an agent computation component when the current state of the solution state machine object is the second intermediate state. The result received from the agent computation component includes a second confidence measure. The executing includes transitioning the current state of the solution state machine object according to the transition rules and at least one of: the first confidence measure and the second confidence measure. A characterization of the terminal state is provided when the current state of the solution state machine object is the terminal state.

One or more of the following features can be included in any feasible combination. For example, processing of the agent computation component can be requested when a confidence of the machine computation component result is below a first threshold. Processing of the agent computational component can be requested when the confidence of the machine computation component result is above a second threshold. Providing can include requesting further processing of the agent computation component result by the machine computation component. The providing can include requesting further processing of the machine computation component result by the agent computation component.

The machine computation component can include a deep learning artificial intelligence classifier. The machine computation component can detect objects and classify objects in the sensor data. The sensor data can include an image.

A composite result from the machine computation component result and the agent computation component result can be determined. The determining can include using a measure of result confidence. At least one of the receiving, classifying, and providing can be performed by at least one data processor forming part of at least one computing system.

The machine computation component can execute a machine learning algorithm to perform the task. The machine computation component includes a convolutional neural network.

The agent computation component can include a platform that queries at least one agent, receives a query result, determines a confidence measure of the agent, and determines the second confidence measure using the confidence measure of the queried agent.

The sensor data can include an image including a single image, a series of images, or a video. The computational task can include: detecting a pattern in the image; detecting a presence of an object within the image; detecting a presence of a person within the image; detecting intrusion of the object or person within a region of the image; detecting suspicious behavior of the person within the image; detecting an activity of the person within the image; detecting an object carried by the person; detecting a trajectory of the object or the person in the image; determining a status of the object or person in the image; identifying whether a person who is detected is on a watch list; determining whether a person or object has loitered for a certain amount of time; detecting interaction among persons or objects; tracking a person or object; determining status of a scene or environment; determining the sentiment of one or more people; counting the number of objects or people; determining whether a person appears to be lost; determining whether an event is normal or abnormal; and/or determining whether text matches that in a database.

The security system asset can include an imaging device, a video camera, a still camera, a radar imaging device, a microphone, a chemical sensor, an acoustic sensor, a radiation sensor, a thermal sensor, a pressure sensor, a force sensor, or a proximity sensor. The modality can define solution state machine object attributes, acceptable confidence for reaching the terminal state, a set of assets that trigger the modality, and/or agent query structure.

Executing the task can include posting, via a messaging queuing protocol, requested processing tasks. The machine computation component and agent computation component can include microservices operating on tasks posted via the messaging queue protocol.

A predictive model of the machine computation component can be modified using the result received from the agent computation component as a supervisory signal and the received sensor data as input.

At least one of the receiving, accessing, instantiating, executing, and providing is performed by at least one data processor forming part of at least one computing system.

In yet another aspect, sensor data is received. The sensor data is classified into a first class by at least requesting processing of a machine computational component, receiving a first result of the machine computation component, requesting processing of an agent computation component, and receiving a first result of the agent computation component. The agent computation component includes a platform to query an agent. The sensor data can be classified into a second class by at least requesting processing of the machine computational component, receiving a second result of the machine computation component, requesting processing of the agent computation component, and receiving a second result of the agent computation component. A set of rules is applied to the first class and the second class to enable a determination of a composite classification. The composite result is provided.

In yet another aspect, first sensor data of a first security system asset and second sensor data of a second security system asset are received. A first predefined modality associated with the first security system asset and a second predefined modality associated with the second security system asset are accessed. The first modality defines a first computational task for analyzing the received first sensor data. The second modality defines a second computational task for analyzing the received second sensor data. A first solution state machine object and a second solution state machine object are instantiated. The first solution state machine object has a plurality of states and rules for transitioning between the plurality of states. The plurality of states includes an initial state, a first intermediate state, a second intermediate state, and a terminal state. A result of the first task and a result of the second task are determined by executing the first task using the first solution state machine object and the second task using the second solution state machine object. The executing includes requesting processing of the first task by a machine computation component and an agent computation component. A composite result is determined by applying a set of rules to the result of the first task and the result of the second task. The composite result is provided.

One or more of the following features can be included in any feasible combination. The set of rules can include matching sensor data within a predetermined time-window. The providing can include requesting further processing of the machine computation component result by the agent computation component. The machine computation component can detect objects and classify objects in the sensor data. At least one of the receiving, classifying, and providing can be performed by at least one data processor forming part of at least one computing system. The sensor data can include a first image of a first security system asset and a second image of a second security system asset. At least one of the receiving, accessing, instantiating, executing, and providing can be performed by at least one data processor forming part of at least one computing system.

Non-transitory computer program products (i.e., physically embodied computer program products) are also described that store instructions, which when executed by one or more data processors of one or more computing systems, cause at least one data processor to perform operations herein. Similarly, computer systems are also described that may include one or more data processors and memory coupled to the one or more data processors. The memory may temporarily or permanently store instructions that cause at least one processor to perform one or more of the operations described herein. In addition, methods can be implemented by one or more data processors either within a single computing system or distributed among two or more computing systems. Such computing systems can be connected and can exchange data and/or commands or other instructions or the like via one or more connections, including but not limited to a connection over a network (e.g. the Internet, a wireless wide area network, a local area network, a wide area network, a wired network, or the like), via a direct connection between one or more of the multiple computing systems, etc.

The details of one or more variations of the subject matter described herein are set forth in the accompanying drawings and the description below. Other features and advantages of the subject matter described herein will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 is a process flow diagram illustrating a method of augmenting artificial intelligence with human intelligence tasks;

FIG. 2 is a state diagram of an example solution state machine as defined by a modality;

FIG. 3 is a diagram illustrating composite modalities to solve a higher-order problem;

FIG. 4 is a process flow diagram illustrating a method of augmenting artificial intelligence using composite modalities;

FIG. 5 is a system block diagram of an example analysis platform including software components for combining machine and human intelligence as a solution for responding to questions and problem scenarios;

FIG. 6 illustrates an exchange of an event messaging system;

FIG. 7 illustrates data flow between components of a platform during a process of augmenting artificial intelligence with human computation;

FIG. 8 is a block diagram illustrating example metadata;

FIGS. 9-11 are tables illustrating example modalities and example security scenarios to which the modality can apply;

FIG. 12 is a system block diagram of an example machine computation component system that implements a deep learning based object detector;

FIG. 13 illustrates an example input image and an example output image to an artificial intelligence system;

FIG. 14 is a system block diagram illustrating an object detector web application program interface (API);

FIG. 15 is a system block diagram illustrating an example system including a human-computation element and a machine decision-making algorithm;

FIG. 16A is a process for injecting human-computation into a machine decision-making algorithm;

FIG. 16B illustrates an example image;

FIG. 17 is a system block diagram illustrating an example implementation of the current subject matter for a video/face recognition system;

FIGS. 18 and 19 are process flow diagrams illustrating using the current subject matter for face recognition and using the face recognition system;

FIGS. 20 and 21 illustrate applying the current subject matter to handle a wide variety of tasks, such as counting sports utility vehicles (SUVs) in a parking lot or validating computer vision analytic performance; and

FIG. 22 is a block diagram illustrating an example of hardware used by the current subject matter.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

The current subject matter relates to utilizing a “human-in-the-loop” (symbiotic human-machine) approach to facilitate decision making. Humans can contribute entirely new decisions/answers or assist when AI is not highly confident, and in that way are augmenting/assisting the machine process in solving a particular task, not merely verifying the computer decision making. The current subject matter can expand the range of use cases to which a machine decision making system or a given sensor and/or analytic may effectively apply. The current subject matter can provide for injection of a human-computation element into a machine decision-making algorithm, allowing for a human to perform (or solve) specific and narrow decisions that the machine decision making system would otherwise be unable to perform (or would perform poorly). The subject matter can be used with applications that do not currently include machine decision-making algorithms or use algorithms that do not adequately meet user needs, for example a closed circuit television system that currently does not have a machine decision-making algorithm or has limited machine decision-making capability. The current subject matter can enable new capabilities and improve machine decision making, for example, by reducing false alarms, increasing hits, reducing misses, and increasing correct rejections.

While great advances have been made in the area of artificial intelligence, the performance of software-only systems often falls short of that which is needed for applications involving analysis of physical world imagery, video, language processing, and the like. Key challenges for end users are the prevalence of false positives (“false alarms”), the variation in system performance caused by changes in circumstances or scene type (“brittleness”), and the inability for these systems to produce human-like outputs in scenarios that are highly subjective or contextual (as is frequently the case in the physical security domain). The current subject matter includes data analysis and handling that uses human-in-the-loop processing (also referred to as human intelligence tasks) alongside artificial intelligence to address the aforementioned challenges, by combining the respective strengths of computer and human processing, while minimizing the amount of human involvement required.

The current subject matter can include an analysis platform for augmenting machine processing with human intelligence tasks to improve performance and reduce false alarms. The analysis platform can include a machine computation component, described more fully below, that can include predictive models built using a machine learning algorithm, for example, a deep neural network. The machine computation component can classify input data into two or more classes.

The analysis platform can include an agent computation component, described more fully below, that can include a system for querying a pool of agents (e.g., humans) to perform a task, such as detecting a presence of a person or object in an image, answering a question regarding a characteristic of the image, and the like. In some implementations, the agent computation component can provide a query result in substantially real-time, such as within 5, 10, or 30 seconds of receiving a query request. In some implementations, the analysis platform can be applied to a physical security and surveillance domain, which is highly subjective and contextual.

The current subject matter can include use of modalities, which enables any given problem to be broken or segmented into computational tasks. Some tasks may be better performed by an existing artificial intelligence predictive model while other tasks may be better performed by a human. Thus, the current subject matter can route a given task for processing by either an artificial intelligence processing component or an agent (e.g., human) processing component. The concept of modalities can be extended to composite modalities, whereby multiple modalities are combined (e.g., strung together) to solve more difficult and even subjective tasks. Composite modalities can be accurate because the confidence of the result of each underlying modality can be high (e.g., treated as truth).

Because some artificial intelligence systems can be continually trained, their performance can improve over time. The current subject matter can route tasks based on machine performance, which can be represented by a confidence metric produced by the artificial intelligence system. As the artificial intelligence component is trained on more real-world data, the artificial intelligence component can become more accurate, so that less agent input is required. Thus, the relative processing burdens between the artificial intelligence component and the human intelligence component are dynamic and can vary over time.

FIG. 1 is a process flow diagram illustrating a method 100 of augmenting artificial intelligence with human intelligence tasks. The method 100 of augmenting artificial intelligence with human intelligence tasks is implemented using flow control, which can be represented as a state machine for solving a computational task.

At 110, sensor data is received. The sensor data can be received from and/or of a security system asset. An asset can include an imaging device, a video camera, a still camera, a radar imaging device, a microphone, a chemical sensor, an acoustic sensor, a radiation sensor, a thermal sensor, a pressure sensor, a force sensor, a proximity sensor or a number of other sensor types. “Sensor,” as used herein may include information that did not originate specifically from physical hardware, such as a computer algorithm. The sensor data can include, for example, an image (e.g., optical, radar, and the like), video, audio recording, data generated by any of the above-enumerated assets, and the like. In some implementations, the sensor data can be from a system other than a security system, for example, the sensor data can be access control system data, weather system data, data about the risk posed by an individual or the risk of a security threat given a set of conditions. Other system types are possible.

The security system can include a number of deployment types including closed circuit television, surveillance camera, retail camera, mobile device, body cameras, drone footage, personnel inspection systems, object inspection systems, and the like.

The security system can be implemented in many ways. For example, the security system can include a system to detect for physical intrusion into a space (e.g., whether a person is trespassing in a restricted area); a system to determine whether an individual should or should not be allowed access (e.g., a security gate); a system to detect for objects, people, or vehicles loitering in a region; a system to detect for certain behavior exhibited by a person (e.g., suspicious behavior); a system to track a person or object viewed from one asset (e.g., camera) to another asset; a system to determine the status of an object in the asset field of view (e.g., whether there is snow on a walkway); a system to count people or objects (e.g., vehicles) in a scene; a system to detect for abnormal conditions (e.g., as compared to a baseline condition); a system to detect license plates over time; a system to detect for weapons, contraband, or dangerous materials on a person or within a container (e.g., a security checkpoint); and the like.

At 120, a predefined modality is accessed. The accessing can be from memory. The predefined modality can be associated with the security system asset. The modality can define a computational task for analyzing the received sensor data. For example, where the asset is a video camera monitoring the threshold of a building, the predefined modality can include a computational task that specifies that an image taken by the asset should be processed to detect for a presence of a person in the threshold (e.g., a region of the image). Associated with the asset can be a collection of configurable data that can be provided for each asset modality pairing. Asset details can include, for example, inclusion areas, exclusion areas, filtering parameters, region of interest requirements and the like. These are all specific to the asset scene for that modality.

A modality can be considered an architectural concept that, when used as building blocks, can capture a pattern of security objectives. An analysis platform can expose modalities as building blocks for clearly articulating the problem to be solved. An example modality can include an intrusion detection scenario, where the pattern represented is one of first detecting that a trigger has happened and that the trigger was caused by a human and that the human is intruding upon a defined area. A modality can guide and coordinate machine computation components and agent computation components of the platform. Modalities can provide direction to the analysis platform regarding what the security system is trying to detect or control.

In some implementations, the predefined modality can define a solution state machine or flow control that provides a framework for utilizing the processing components of the analytical platform to solve a problem (which could be a piece of a larger scenario). Each computation component can have access to the solution state machine and can advance the state. The solution state machine can have states including an initial state, intermediate states, and terminal states. Each state can correspond to a particular type of processing by components of the platform. For example, to do trigger detection, a flow control first tries to use a machine computation component to determine if a person is in the frame, and once a person is detected, sends the frame to an agent to determine if the person has crossed a determined threshold. Further, flow control can be used to ensure or improve a certain level of confidence in a determination thereby reducing false alarms. For example, if the machine computation component detects the presence of a person in the frame but returns a low confidence (e.g., characteristics of the image make it challenging for the predictive model to accurately perform) the analysis platform can utilize the agent computation component to process the task (e.g., detect whether a person is present in the frame). The agent computation component can utilize human judgement to perform the task, which may be better suited than the machine computation component.
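The low-confidence fallback described above can be sketched as follows. This is a minimal illustration only, assuming a single confidence threshold; the names `Result`, `classify_with_fallback`, and the default threshold of 0.9 are hypothetical and are not taken from the platform itself.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Result:
    label: str         # e.g., "person" or "no_person"
    confidence: float  # likelihood in [0, 1] that the label is correct


def classify_with_fallback(
    frame: object,
    machine: Callable[[object], Result],
    agent: Callable[[object], Result],
    threshold: float = 0.9,  # hypothetical acceptable confidence
) -> Tuple[Result, str]:
    """Try the machine computation component first; query the agent
    computation component only when machine confidence is too low."""
    machine_result = machine(frame)
    if machine_result.confidence >= threshold:
        return machine_result, "machine"
    # Machine confidence is below the threshold: route the same task to an agent.
    return agent(frame), "agent"
```

In this sketch the agent is queried only when needed, which is one way to keep human involvement low while still bounding false alarms.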

Sensor data can be initially processed to detect an event. Initial processing can include video motion detection that, when motion is detected, triggers an event. In some implementations, the initial processing can include video analytics that, when an object of interest is detected or rule is satisfied, triggers an event. Occurrence of an event can start a new state machine invocation (embodied in a task), which maintains a state throughout its tasking until it reaches a terminal node. Every participant involved in the process of solving the problem can access and potentially advance the state machine.

The solution state machine can include transition rules for transitioning between states. These rules can be based on the confidence of the associated processing that takes place when the solution state machine is in the state. The solution state machine can also be represented as a directed graph where intermediate nodes correspond to processing components and edges define transition rules.

The predefined modality can also define or include: the ultimate question to be answered, which can be customized for the particular modality; an agent work form, which can be the type of form agents would be served to best answer the question based on the artifacts received from the assets; an acceptable confidence, which defines the acceptable thresholds for considering the question met or not by an artifact; and a set of assets which can trigger tasking for this modality (in other words, the set of sensors/cameras that provide the artifacts specific to this tasking where each asset in the set is independent, meaning that triggering from each one will cause independent tasking and question resolution).

Concretely, FIG. 2 is a state diagram of an example solution state machine 200 as defined by a modality. “S” is a start state, “MI” is a first machine intelligence state, “MI2” is a second machine intelligence state, “HI” is a human intelligence state, “ES” is a terminal state corresponding to a successful match (e.g., pattern match, classification, detection, and the like), and “EF” is a terminal state corresponding to an unsuccessful match (e.g., pattern match, classification, detection, and the like). “C” relates to confidence of the processing at each state and “T” relates to the number of times the associated processing has been performed. Transition rules are Boolean expressions of the confidence (“C”) and processing times (“T”).
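Transition rules of this kind can be sketched as Boolean functions of C and T. The state names below follow FIG. 2, but the specific edges, the acceptance threshold `ACCEPT`, and the retry cap `MAX_TRIES` are assumptions made for illustration; the actual rules are whatever the modality defines.

```python
ACCEPT = 0.9    # hypothetical acceptable confidence
MAX_TRIES = 2   # hypothetical cap on repeated processing in one state

# For each state: an ordered list of (rule, target) pairs. Each rule is a
# Boolean function of confidence c and processing count t; the first rule
# that evaluates to True selects the next state.
RULES = {
    "S":   [(lambda c, t: True, "MI")],
    "MI":  [(lambda c, t: c >= ACCEPT, "MI2"),
            (lambda c, t: t < MAX_TRIES, "MI"),
            (lambda c, t: True, "HI")],
    "MI2": [(lambda c, t: c >= ACCEPT, "ES"),
            (lambda c, t: True, "HI")],
    "HI":  [(lambda c, t: c >= ACCEPT, "ES"),
            (lambda c, t: t < MAX_TRIES, "HI"),
            (lambda c, t: True, "EF")],
}
TERMINAL_STATES = {"ES", "EF"}


def next_state(state: str, c: float, t: int) -> str:
    """Return the state that follows `state` given confidence c and count t."""
    if state in TERMINAL_STATES:
        return state  # terminal states have no outgoing transitions
    for rule, target in RULES[state]:
        if rule(c, t):
            return target
    raise ValueError(f"no transition rule matched for state {state!r}")
```

For example, under these assumed rules a first machine pass with confidence 0.5 is retried once and then handed to the human intelligence state.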

Referring again to FIG. 1, at 130, a flow control, or solution state machine object can be instantiated. The instantiating creates a concrete occurrence of the solution state machine object that exists during runtime. The solution state machine object can have at least two intermediate states, one associated with machine computation component processing and one associated with agent computation component processing.

At 140, the computational task is executed using the solution state machine object. A solution state machine object can be represented in persistent data as a transition table and can be accessible for querying and changing state. Executing the computational task using the solution state machine object provides a data driven means of orchestrating the participants in the analysis platform to drive the participants (e.g., the computation components) closer to a confident solution or quickly to a non-solution. A data driven flow can eliminate the need of actually coding a solution and can allow distributed components to cooperate on driving the state machine for any external request.

Execution of the computational task can include, at 142, requesting processing of the task by, and receiving a result of, a machine computation component when the current state of the solution state machine object is in a machine computation component state. The machine computation component can execute the task by applying a predictive model to the sensor data to determine an output (e.g., pattern match, classification, detection, and the like). The machine computation component can also determine a confidence measure of its output. The confidence measure can characterize a likelihood that the output of the machine computation component is correct. For example, in the implementation where the machine computation component is a convolutional neural network, the convolutional neural network's last layer can be a logistic regression layer, which classifies image patches into labels. During the training phase, this value can be set to 1 for positive examples and to 0 for negative examples. During the operational phase (e.g., when applying new data to the convolutional neural network), this value can be the probability of an input image being the object of interest.
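As a worked illustration of how a logistic last layer yields a confidence, the sketch below applies a sigmoid to a weighted sum of features and scores it against a 1/0 training target with binary cross-entropy. The function names, the plain-list feature representation, and the choice of cross-entropy as the training loss are assumptions for illustration; a real convolutional network would compute the features in its earlier layers.

```python
import math
from typing import Sequence


def logistic_confidence(features: Sequence[float],
                        weights: Sequence[float],
                        bias: float) -> float:
    """Probability that the input is the object of interest."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    # Numerically stable sigmoid: avoid exp() overflow for large |score|.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def binary_cross_entropy(p: float, target: int, eps: float = 1e-12) -> float:
    """Training loss for target 1 (positive example) or 0 (negative example)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp so log() is always defined
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))
```

At operation time only `logistic_confidence` is needed; its output can be compared directly against the acceptable confidence defined by the modality.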

Execution of the computational task can include, at 144, requesting processing of the task by, and receiving a result of, an agent computation component when the current state of the solution state machine object is in an agent computation component state. The agent computation component can execute the task by querying one or more agents in a pool of agents to perform the task, such as an image recognition task, answering a question regarding a characteristic of the image, and the like. In some implementations, the agent computation component can provide a query result in substantially real-time, such as within 5, 10, or 30 seconds of receiving a query request. The agent computation component can also determine a confidence measure of its output. The confidence measure may be directly supplied by an agent or can be determined by the agent computation component using an algorithm that assesses the accuracy and reliability of the agent that provides a response. The agent computation component can query multiple agents and create a composite output and a composite confidence. The confidence measure can characterize a likelihood that the output of the agent computation component is correct.
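One possible way to form a composite output and composite confidence from several agents is a reliability-weighted vote, sketched below. The weighting scheme and the function name `composite_agent_result` are assumptions; the text above does not specify which algorithm the agent computation component uses.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def composite_agent_result(
    responses: Iterable[Tuple[str, float]],
) -> Tuple[str, float]:
    """Combine (label, reliability) pairs from multiple agents.

    `reliability` is a hypothetical per-agent weight in (0, 1], for example
    an agent's historical accuracy. The composite label is the one with the
    largest total weight, and the composite confidence is that label's share
    of the total weight.
    """
    totals = defaultdict(float)
    for label, reliability in responses:
        totals[label] += reliability
    if not totals:
        raise ValueError("at least one agent response is required")
    best_label = max(totals, key=totals.get)
    return best_label, totals[best_label] / sum(totals.values())
```

With this weighting, a single unreliable dissenting agent lowers the composite confidence without overturning the answer of two reliable agents.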

Execution of the computational task can include, at 146, transitioningthe current state of the solution state machine object according to thetransition rules. For a given state, the transition rules can be appliedwhen a result of a computation component is returned by a respectivecomputation component. By applying the transition rules, the currentstate of the solution state machine can change (according to thetransition rules) and, when a new state is entered, an associatedprocessing step can be performed.

Execution of the computational task can include one or more of requesting processing of the task by, and receiving a result of, a machine computation component 142; one or more of requesting processing of the task by, and receiving a result of, an agent computation component 144; and one or more of transitioning the current state of the solution state machine object according to the transition rules 146. Execution of the computational task can be performed according to the solution state machine object states and transition rules as specified in the predefined modality. Thus, execution of the computational task is a flexible process that can vary, for example, between tasks and specific content of the sensor data.

Once the current state of the solution state machine object is a terminal state, at 150, a characterization of the terminal state can be provided. The characterization may relate to a classification of the sensor data (according to the task). For example, if the task being performed is to detect whether or not there is a person in an image, a given solution state machine object may have two terminal states: a first terminal state (e.g., a solution state) that is reached if the agent computation component and machine computation component can provide a classification with a certain level of confidence (e.g., 0.9), and a second terminal state (e.g., a non-solution state) that is reached when the agent computation component and machine computation component cannot provide a classification with a certain level of confidence (e.g., less than 0.9).
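The interplay of steps 142, 144, 146, and 150 can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the state names, the rule "try the machine first, then fall back to an agent", and the callables standing in for the computation components are all hypothetical. Only the two terminal states and the example 0.9 confidence level come from the description above.

```python
# Hypothetical state names; the description only requires intermediate
# machine/agent states and terminal solution/non-solution states.
MACHINE, AGENT = "MACHINE", "AGENT"
SOLUTION, NO_SOLUTION = "SOLUTION", "NO_SOLUTION"
THRESHOLD = 0.9  # example confidence level from the text


def next_state(state, confidence):
    """Assumed transition rules, applied when a component returns a result."""
    if confidence >= THRESHOLD:
        return SOLUTION
    if state == MACHINE:
        return AGENT  # low machine confidence: escalate to an agent
    return NO_SOLUTION


def run_task(sensor_data, machine_component, agent_component):
    """Drive the state machine until a terminal state is reached.

    Each component is a callable returning (label, confidence).
    Returns (terminal_state, last_result).
    """
    components = {MACHINE: machine_component, AGENT: agent_component}
    state, result = MACHINE, None
    while state in components:
        result = components[state](sensor_data)
        state = next_state(state, result[1])
    return state, result


if __name__ == "__main__":
    outcome = run_task(
        "image-bytes",
        machine_component=lambda data: ("person", 0.6),
        agent_component=lambda data: ("person", 0.95),
    )
    print(outcome)
```

A different modality could supply different transition rules (for example, querying the agent first) without changing the driver loop.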

The characterization of the terminal state can be provided, for example, as an alert to a manager of the security system. For example, the security system manager may have an escalation policy that requires that they be alerted regarding the outcome of the task if the task detects a certain condition (e.g., intrusion into the building is occurring). The escalation alert can be in any form, such as MMS, SMS text, email, and the like.

Modalities can be considered as processing building blocks that answer relatively basic tasks. For example, FIGS. 9-11 are tables illustrating example modalities and example security scenarios to which the modality could apply. Modalities are a flexible and powerful tool for problem solving within the context of a human augmented machine decision making system. Modalities may be combined (or strung together) for answering complex and subjective problems. Modality composition is the ability to express a hierarchy of modalities such that positive results from lower-level tasking are passed up to a composite modality, which aggregates multiple modality results to answer a higher-order question. The power of composite modalities can include the fact that truth (or high-confidence determinations) is established at terminal modalities and that truth is passed up to make very informed aggregate decisions.

For example, consider FIG. 3, which is a diagram illustrating composite modalities to solve a higher-order problem. A security system has two cameras with completely different fields of view; one (Camera-1) is inside the facility looking at a door and another (Camera-2) is outside the facility looking at a loading dock. The operator of the system should be alerted whenever someone enters the door and there is no truck in the loading dock. This problem (e.g., scenario) can be solved by composite modalities. Camera-1 can run an intrusion modality, while Camera-2 can run a presence modality. Each of these cameras can produce sensor data (e.g., artifacts) and provide the sensor data to the analysis platform. The analysis platform can initiate modality tasking for each of the two sensors independently. The security system operator can be alerted if there is an aggregate positive condition of both within the same time frame. Events across all sub-modalities can be recorded and correlation can be performed whenever a sub-modality triggers a match.

Modality composition can be defined by specific rules that match sub-modality results with each other to try to satisfy the composite modality. Composite rules can have specific logic for composing their sub-modalities. The logic can be augmented with customer input for rules (e.g., values) that should be used for a specific security system.

For example, the following composite modality rules can be defined: time synchronization, match list, time series, and value coordinated. The time synchronization rule attempts to match all sub-modality results that occur within the same time frame. The threshold of the time frame can be customer defined. All times used are the times stamped by the asset (e.g., camera) at the precise time the artifact is collected. The match list rule attempts to match all last recorded sub-modality results when any one of the sub-modalities has a new report. So, if sub-modality A has a new report, the rule attempts to match on whatever the last recorded value for sub-modality B is at that time. The time series rule attempts to match all sub-modality results that occur within a sequenced time period from one to the next. The sequence order and time thresholds can be customer defined. The value coordinated rule attempts to match all sub-modality results that have specific values for answers given by the analysis platform. The values and matching criteria can be provided by the customer. Other composite modality rules are possible.
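Two of these rules, time synchronization and match list, can be sketched as follows. The data shapes (a mapping from sub-modality name to a positive/negative result and an asset timestamp), the function name `time_synchronized`, and the class name `MatchList` are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def time_synchronized(results, window):
    """Time synchronization rule (sketch).

    results: dict mapping sub-modality name -> (is_positive, asset_timestamp).
    window: customer-defined timedelta threshold for "the same time frame".
    Matches only if every sub-modality is positive and all asset timestamps
    fall within the window of one another.
    """
    if not results or not all(positive for positive, _ in results.values()):
        return False
    stamps = [stamp for _, stamp in results.values()]
    return max(stamps) - min(stamps) <= window


class MatchList:
    """Match list rule (sketch): re-evaluate on every new report using the
    last recorded value of every other sub-modality."""

    def __init__(self, names):
        self.last = {name: None for name in names}  # None = nothing recorded

    def report(self, name, is_positive):
        self.last[name] = is_positive
        return all(value is True for value in self.last.values())


if __name__ == "__main__":
    t0 = datetime(2015, 10, 6, 12, 0, 0)
    results = {
        "intrusion": (True, t0),
        "presence": (True, t0 + timedelta(seconds=45)),
    }
    print(time_synchronized(results, timedelta(minutes=1)))
```

The time series and value coordinated rules could follow the same pattern, with an ordering check on the timestamps or a comparison against customer-supplied answer values.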

FIG. 4 is a process flow diagram illustrating a method 400 of augmenting artificial intelligence using composite modalities. At 410, sensor data is received from a first security system asset and sensor data is received from a second security system asset. For example, the assets can include a first camera and a second camera. The cameras need not have overlapping fields of view.

At 420, a first predefined modality associated with the first security system asset and a second predefined modality associated with the second security system asset can be accessed. The first modality can define a computational task for analyzing the received first sensor data. The second modality can define a second computational task for analyzing the received second sensor data. For example, the first modality can be an intrusion modality and the second modality can be a presence modality.

At 430, a first solution state machine object and a second solution state machine object are instantiated. For example, each instantiation creates a concrete occurrence of the respective solution state machine object that exists during runtime. Each respective solution state machine object can have at least two intermediate states, one associated with machine computation component processing and one associated with agent computation component processing.

At 440, each task can be executed using its respective solution state machine object such that the processing includes processing by a machine computation component and by an agent computation component. After execution, each task has a result (for example, presence of a person or intrusion is detected).

At 450, a composite result can be determined by applying a set of rules to the results of the tasks. For example, the set of composite rules can include a rule requiring that each modality result be positive and that the sensor data that led to the positive results be obtained within one minute of one another.

At 460, the composite result can be provided. The composite result can be provided, for example, as part of an escalation policy to alert the security system operator. The composite result may relate to a classification of the sensor data (according to the task). The composite result can be provided, for example, as an alert to a manager of the security system. For example, the security system manager may have an escalation policy that requires that they be alerted regarding the outcome of the task if the task detects a certain condition (e.g., intrusion into the building is occurring). The escalation alert can be in any form, such as MMS, SMS text, email, and the like.

In some implementations, the computational task includes: detecting a pattern in the image; detecting a presence of an object within the image; detecting a presence of a person within the image; detecting intrusion of the object or person within a region of the image; detecting suspicious behavior of the person within the image; detecting an activity of the person within the image; detecting an object carried by the person; detecting a trajectory of the object or the person in the image; determining a status of the object or person in the image; identifying whether a person who is detected is on a watch list (e.g., part of a gallery of face images); determining whether a person or object has loitered for a certain amount of time; detecting interaction among persons or objects; tracking a person or object; determining status of a scene or environment (e.g., cleanliness, feeling of safety, weather conditions); determining the sentiment of one or more people; counting the number of objects or people; determining whether a person appears to be lost (e.g., non-suspicious behavior); determining whether an event is normal or abnormal; and determining whether text (e.g., license plate text) matches that in a database. Other tasks are possible as the current subject matter can apply to a wide range of tasks.

As described above, the machine computation component can include an artificial intelligence (e.g., machine learning) system that develops and utilizes a predictive model. The machine computation component can include any number of algorithms. In some implementations, the machine computation component can include an artificial intelligence algorithm, a machine learning algorithm, a deep learning algorithm, a deep neural network, a convolutional neural network (CNN), a Faster Region-based CNN (R-CNN), and the like. For example, FIG. 12 is a system block diagram of an example machine computation component system 1200 that implements a deep learning based object detector 1210. The object detector 1210 includes a CNN for performing image processing including creating a bounding box around objects in an image and detecting or classifying the objects in the image. The input to the object detector is a digital image and the output is an array of bounding boxes and corresponding class labels. An example input image and an example output are illustrated in FIG. 13. The class labels are: person, car, helmet, and motorcycle.

In some implementations, Faster R-CNN incorporates flow information. This approach can reduce false alarms from the AI. A real-time tracking method can be used. The real-time tracking method uses data association and state estimation techniques to correct the bounding boxes and remove false positives. The tracking method assumes a linear velocity model and computes the location of the object in the next frame using a Kalman filter.
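The linear velocity model and Kalman filter prediction can be sketched for a single bounding-box coordinate (for example, the x position of the box center). This is a minimal textbook constant-velocity Kalman filter, not the tracker of the described system; the class name and the noise parameters `p`, `q`, and `r` are assumptions chosen for illustration.

```python
class ConstantVelocityKalman1D:
    """Kalman filter over the state [position, velocity] for one coordinate."""

    def __init__(self, position, velocity=0.0, p=1.0, q=0.01, r=1.0):
        self.x = [position, velocity]      # state estimate
        self.P = [[p, 0.0], [0.0, p]]      # state covariance
        self.q = q                         # process noise (assumed)
        self.r = r                         # measurement noise (assumed)

    def predict(self, dt=1.0):
        """Predict the position in the next frame: x' = F x, P' = F P F^T + Q,
        with F = [[1, dt], [0, 1]] (constant velocity)."""
        pos, vel = self.x
        self.x = [pos + dt * vel, vel]
        (a, b), (c, d) = self.P
        self.P = [
            [a + dt * c + dt * (b + dt * d) + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.x[0]

    def update(self, measured_position):
        """Correct the prediction with a detected position (H = [1, 0])."""
        residual = measured_position - self.x[0]
        (a, b), (c, d) = self.P
        s = a + self.r                     # innovation variance
        k0, k1 = a / s, c / s              # Kalman gain
        self.x = [self.x[0] + k0 * residual, self.x[1] + k1 * residual]
        self.P = [
            [(1 - k0) * a, (1 - k0) * b],
            [c - k1 * a, d - k1 * b],
        ]


if __name__ == "__main__":
    kf = ConstantVelocityKalman1D(position=0.0)
    for frame in range(1, 21):
        kf.predict()
        kf.update(float(frame))            # object moves 1 unit per frame
    print(kf.x)
```

A detection whose position lies far from the predicted location could then be treated as a candidate false positive during data association.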

Before an object detector can be used for detecting objects, it needs to be trained. A training set can include one or more images with bounding boxes around objects the system is interested in detecting and the corresponding class labels. A database of training images can be created or maintained. In some implementations, the database can be updated over time with real world images and labels.

Hard negative mining can better train the convolutional neural network. The example Faster R-CNN uses background patches in the image as negative examples. In some implementations, since the number of background patches is generally much larger than the number of object patches, not all background patches can be included because doing so biases the object detection model. A specific ratio (20:1) of negative to positive examples can be maintained. Faster R-CNN can pick these negative examples randomly. For hard negative mining, those negative examples that result in the highest loss can be chosen. But this approach trains the predictive model only for difficult and unusual examples of objects. So half of the negative examples can be taken from the hard negatives (which give the highest loss) and half taken randomly from the rest of the negative examples.
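The half-hard, half-random selection can be sketched as follows. The function name `select_negatives`, its parameters, and the representation of negatives as a list with a parallel list of per-example losses are assumptions for illustration; the 20:1 ratio and the even split come from the description above.

```python
import random


def select_negatives(negatives, losses, num_positives, ratio=20, rng=None):
    """Pick negative examples: half with the highest loss, half at random.

    negatives: list of candidate background patches (any objects).
    losses: per-candidate loss under the current model, same order.
    num_positives: number of positive (object) examples in the batch.
    ratio: negatives kept per positive (20:1 in the text).
    """
    rng = rng or random.Random()
    budget = min(len(negatives), ratio * num_positives)

    # Indices sorted from highest loss (hardest) to lowest.
    order = sorted(range(len(negatives)), key=lambda i: losses[i], reverse=True)

    n_hard = budget // 2
    hard = order[:n_hard]
    # The remaining budget is sampled uniformly from the other negatives.
    sampled = rng.sample(order[n_hard:], budget - n_hard)
    return [negatives[i] for i in hard + sampled]


if __name__ == "__main__":
    patches = list(range(100))
    patch_losses = [float(i) for i in range(100)]
    chosen = select_negatives(patches, patch_losses, num_positives=2,
                              rng=random.Random(0))
    print(len(chosen))
```

Mixing random negatives with hard ones keeps the model exposed to ordinary background while still emphasizing the examples it currently gets wrong.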

In example implementations, a Faster R-CNN based object detector 1210 is used. The Faster R-CNN 1210 includes a bank of convolutional layers 1220, a region proposal network (RPN) 1230, and an object classifier 1240. The bank of convolutional layers 1220 finds features that are useful for two purposes: a) finding which rectangular regions in the image potentially contain an object of interest and b) correctly classifying the object inside the proposed rectangular regions. The RPN 1230 looks at the feature maps produced by the convolutional layers 1220 and proposes rectangular regions that may contain an object of interest. The object classifier 1240 looks at the feature maps and each region proposed by the RPN 1230 and classifies each region as one of the objects of interest or not. The object classifier can generate a score from 0.0 to 1.0 related to the confidence that the object is not present (0.0) or present (1.0). The classification can be binary or multiclass.

Training the object detector requires finding the right weights/parameters associated with each of these three components. Manually labeled bounding boxes and object labels are used to guide the process of finding the correct weights using a backpropagation algorithm. Using an alternate or additional training method, the RPN 1230 is first trained and the region proposals are used to train the object classifier 1240. The network tuned by the object classifier can then be used to initialize the RPN 1230, and this process is iterated. This way the convolutional layers 1220 are tuned to be effective for both the RPN 1230 and the object classifier 1240.

In the execution phase, a trained object detector 1250 is used to detect objects (e.g., bounding boxes, class labels, and confidence levels) in an image not in the training set. In addition to the class label, the trained object detector 1250 also returns the confidence measure for every bounding box.

FIG. 14 is a system block diagram illustrating an object detector web API. A web server accepts requests from multiple clients and returns the response for the respective request. An application server runs applications in threads, maintaining the correspondence between threads and requests passed from the web server. An application on the application server runs the object detection algorithms and returns detections in the form of objects to the application server, which passes the detections to the web server, which passes the response to the client machine.

In some implementations, a high-confidence output from the agent computation component can be used to train one or more artificial intelligence systems forming the machine computation component. When a high-confidence output is received from the agent computation component, the analysis platform can train an artificial intelligence system using the high-confidence agent computation component output as the supervisory signal and the sensor data as the input signal. Thus, the analysis platform can continually improve in performance and require fewer agent computation component queries to perform the same amount of work. When the confidence measure returned by the machine computation component is low, the image can be sent to an agent who can correct any mistakes in bounding boxes or labeling. Images that have incorrect bounding boxes and/or misclassified labels can be fixed and added to the training set. The system continuously improves because it is routinely retrained after these harder examples are added to the training set.

FIG. 5 is a system block diagram of an example analysis platform 500 that is a system of software components for combining machine and human intelligence as a solution for responding to questions and problem scenarios, for example, relating to security. A customer can provide a problem specification, desired questions or tasks to be performed, and raw inputs (e.g., sensor data such as video). The platform 500 can be configured to provide answers or matches (e.g., results) to the customer.

The example platform 500 is a reactive system of cooperating software components and the communication flows between them. A reactive system is one that is responsive, resilient, elastic, and message driven. As such, new functionality can be added to platform 500 easily to extend the overall system capabilities. Each software component can include a microservice, which can be a fully encapsulated and deployable software component capable of communicating with other platform software components by event-based messaging or directly.

Platform 500 can be a distributed, open platform that can service many projects simultaneously (multi-tenant). Being an open platform means that there are well-defined and formalized communication and messaging specifications by which loosely coupled participants can easily join to enhance the overall system capabilities and functionality. The platform 500 provides many core services that participating components will be able to utilize for common purposes, such as: common data formats; transaction logging/auditing; project specifications; monitoring and health management; message routing; human and machine intelligence integration; and third party integrations.

In some implementations, platform 500 follows a microservices architectural approach for rapidly building independent, functionally bounded components that collaborate to provide an end-to-end solution. Collaboration among components can be designed along both event-driven and service-oriented architectures. Workflow orchestrations can be both ad hoc, by providing a core publish/subscribe system, and formal, via standard web service representational state transfer (REST) API endpoints.

Platform 500 includes an event messaging system 505 and a number of distributed microservices (510, 515, 520, 525, 530, 535, 540, 545, 550, and 555). The distributed microservices are components or modules of the platform 500 and communicate via the event messaging system 505. With regard to the event messaging system 505, a principal communication mechanism for microservices is event-based messaging. Advanced Message Queuing Protocol (AMQP) is an example protocol having distributed queue management and publish/subscribe semantics.

A component of AMQP is the exchange 600, illustrated in FIG. 6. An exchange accepts messages and routes them to queues according to the queue binding type and/or subscription matches. Topic-based exchanges allow for consumer queue subscriptions with a routing key pattern, including both wildcards and explicit matching requirements. Messages that match a routing key are delivered to the consumer's queue. Another component of AMQP is the queue. A message queue may be either specific to a consumer or shared amongst consumers (worker queue). A consumer must acknowledge messages as processed from a queue. Messages that are not acknowledged, possibly because a consumer exited or crashed, will be re-queued for future delivery.
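Routing key pattern matching on a topic-based exchange can be sketched as follows. The sketch follows the AMQP 0-9-1 topic convention, in which words are separated by dots, `*` matches exactly one word, and `#` matches zero or more words. The function names and the example routing keys are hypothetical and are not the platform's actual keys.

```python
def topic_matches(pattern, routing_key):
    """Return True if routing_key satisfies an AMQP-style topic pattern."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i, j):
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j < len(k) and p[i] in ("*", k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(bindings, routing_key):
    """Return the names of the queues whose binding pattern matches.

    bindings: dict mapping queue name -> binding pattern.
    """
    return [queue for queue, pattern in bindings.items()
            if topic_matches(pattern, routing_key)]


if __name__ == "__main__":
    bindings = {
        "task_director": "task.*.report",   # hypothetical bindings
        "audit": "#",
    }
    print(route(bindings, "task.ai.report"))
```

Acknowledgement and re-queueing are handled by the message broker and are not modeled in this sketch.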

Whispering can be the ability for any component to directly converse with any other component. The target component must support whispering and must be listening to the global whisper exchange. The routing key specified by the caller designates which platform 500 component will get the whisper message. If the whisper message is bi-directional, then the caller must also provide a “reply-to” queue which will receive the response.

Referring again to FIG. 5, microservices include the smart media processor (SMP) 510, health and quality services 515, task director services 520, machine analytic services 525, data management services 530, media management services 535, record keeping services 540, alert messaging services 545, audit and record tracking 550, and agent management services 555. Because the event messaging system 505 is flexible and expandable, additional or fewer microservices are possible. The platform 500 includes an interface to a customer 560, which can include one or more security systems, each having one or more assets providing sensor data to the SMP 510.

Smart media processor 510 can include a software component that can process one or more video stream sources and route workable multimedia to the platform 500. It can be easily configured and modified via the platform 500 communication to alter its operating behavior. It can also be tasked to obtain additional multimedia on demand (e.g., an x-minute clip before or after some time for some asset).

Health and quality services 515 monitors all platform 500 participants for health and quality. Data management services 530 maintains customer account/project level and dynamic state data that all platform 500 participants may need access to or contribute to.

Media management services 535 manages all multimedia resource data obtained from customer assets and persists them in long-term storage. Alert messaging services 545 is responsible for determining the correct escalation procedures and executing them (notification, data collections, and the like) when a task result has been achieved. This can involve personal alarming, machine-to-machine integration, or both. Alert messaging services can alert customers via a defined mechanism (SMS, MMS, text, email, and the like) when triggered to do so. Record keeping services 540 and audit and record tracking 550 can record all raw data of platform 500 activity to a data warehouse and data lake for offline analysis and presentation.

Machine analytic services 525 integrate artificial intelligence and deep machine learning into platform 500. The machine analytic services 525 can include a machine computation component that includes an artificial intelligence (e.g., machine learning) algorithm that develops and utilizes a predictive model. Third party machine analytics services 527 may also be utilized by platform 500.

Agent management services 555 is for managing all aspects of human interaction and judgment aggregation. The agent management services 555 can include a platform that queries a pool of agents to process a task by, for example, answering a question regarding sensor data.

Task director services 520 is responsible for progressing the state of a task, starting a task upon proper initiation triggers, and determining when a task is completed for reporting. The task director services 520 serves as the director of various processing tasks, requesting processing of tasks by, and receiving the results of processing from, the machine analytic services 525 and agent management services 555.

Within the platform 500 a task can be an instance of a modality in progress, which can include a solution state machine object. As the modality is a definition of the problem objective, the solution state machine is the “object” that maintains the state of processing for every trigger event received from the assets. Tasks are the workload of the platform 500. They can drive events and processing, and ultimately will end up as successful (accomplished the modality and satisfied the customer's requirements) or failed (did not accomplish the modality). All kinds of tasks can be in motion at any time within the platform 500 and the event-driven nature of the platform 500 can continuously move tasks toward a final state as new information becomes available from components.

Reports are the data results generated by participants against a specific task at a specific state. The task director 520 listens for all reports and uses the data in the report to determine the next state of the task. So, for example, if a task enters a NEED_AI state, there may be multiple machine computation components that start working to solve the current task. When each machine computation component has something to report back, it will create a report and publish it to a reports queue. Task director 520 will get these reports and use the measurement data in them to determine next steps for the task.

The role of the alert messaging services 545, or escalation manager, is to look at every successful “match” produced by the platform and determine the appropriate means of distributing that information out to the customer. Depending on how the customer has configured their project, they may wish to receive immediate alerts to one or more cell phones, or they may wish to have their internal system directly updated with the result information, or they may want both. In any of these cases, it is the job of the escalation manager 545 to perform the proper routing of results to the customer.

Platform 500 uses escalation policies to help direct what should happen when results for tasks have been accumulated. The escalation manager 545 listens for results and then consults appropriate escalation policies to govern next actions. Escalation policies can fall under two types: alert and machine-to-machine. An alert policy governs what should happen upon a result to alert customers or customer representatives to the result. A machine-to-machine policy governs what should happen upon a result with respect to machine integration.

Alerts are push notifications to customers that indicate platform 500 has determined a security scenario has been solved according to the match solution state of the modality. An alert is specific to a modality and will only be triggered for orphan modalities. When an alert is triggered, an alert-type escalation policy is either created or checked to see if any previous alert has been acknowledged or not. Platform 500 will only send a new alert if all previous alerts have been acknowledged.
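The acknowledgement gate on alerts can be sketched as follows. The class name `AlertEscalationPolicy`, its methods, and the use of integer alert identifiers are assumptions for illustration; only the rule that a new alert is sent when all previous alerts have been acknowledged comes from the description above.

```python
class AlertEscalationPolicy:
    """Tracks outstanding alerts for one modality (hypothetical structure)."""

    def __init__(self):
        self._unacknowledged = set()
        self._next_id = 1

    def trigger(self):
        """Return a new alert id, or None if an earlier alert is still
        waiting for acknowledgement (in which case nothing is sent)."""
        if self._unacknowledged:
            return None
        alert_id = self._next_id
        self._next_id += 1
        self._unacknowledged.add(alert_id)
        return alert_id

    def acknowledge(self, alert_id):
        """Record that the customer acknowledged the given alert."""
        self._unacknowledged.discard(alert_id)


if __name__ == "__main__":
    policy = AlertEscalationPolicy()
    first = policy.trigger()      # sent
    second = policy.trigger()     # suppressed: first not yet acknowledged
    policy.acknowledge(first)
    third = policy.trigger()      # sent again
    print(first, second, third)
```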

Machine-to-machine (M2M) is an integration strategy for having platform 500 relay results directly to a customer system (possibly in addition to any alerts). Unlike alerts, M2M escalations can occur regardless of the resultant state of the solution (MATCH, NOMATCH). There are two modes of M2M integration: direct M2M and web socket M2M.

Direct M2M is a mode of integration in which platform 500 makes a direct HTTP POST call to the target system and submits the match results. The target URL is provided in one of two ways. If the escalation data policy data payload map has a “callback” entry, then the URL is taken directly from that entry. If the escalation data policy data payload does not have a “callback” entry, then it is assumed the callback URL is given as metadata with the artifact that was sent by the sensor/camera. In this case, the asset must add the correct “callback” metadata to the artifact upload. Note that in either form of the direct M2M mode, the callback URL must be accessible as a public service endpoint. This may mean that firewall port forwarding or other techniques should be employed to allow traffic to flow from platform 500 to a target system. If it is not possible to provide public accessibility to the target system callback endpoint, then the web socket M2M mode should be used instead.
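The two-step lookup of the callback URL in the direct M2M mode can be sketched as follows. The function names, the dictionary shapes, and the JSON request body are assumptions for illustration; the description specifies only the “callback” entry, the fallback to artifact metadata, and the use of HTTP POST. The sketch builds the request with the Python standard library and does not send it.

```python
import json
import urllib.request


def resolve_callback_url(policy_payload, artifact_metadata):
    """Prefer the escalation policy payload; fall back to artifact metadata."""
    if "callback" in policy_payload:
        return policy_payload["callback"]
    if "callback" in artifact_metadata:
        return artifact_metadata["callback"]
    raise ValueError(
        "no callback URL in escalation policy payload or artifact metadata"
    )


def build_m2m_request(policy_payload, artifact_metadata, match_result):
    """Build (but do not send) the HTTP POST carrying the match results."""
    url = resolve_callback_url(policy_payload, artifact_metadata)
    body = json.dumps(match_result).encode("utf-8")  # assumed JSON encoding
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_m2m_request(
        policy_payload={},
        artifact_metadata={"callback": "https://example.com/results"},
        match_result={"state": "MATCH"},
    )
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would perform the actual call.
```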

Web socket M2M is a mode of integration that utilizes a platform 500 provided tool (MosaiqM2MRelay) which will relay M2M messages from platform 500 to the target system using a web socket connection to platform 500. The MosaiqM2MRelay application must be run from within the internal network and must have direct accessibility to the desired target callback URLs. It will create a private and secure web socket connection to platform 500, through which any M2M messages for the target system will be relayed.

The target callback URL can be provided in exactly the same way as in the direct M2M mode. The difference is that instead of POSTing directly to the callback from platform 500, the request is directed through a web socket to the MosaiqM2MRelay, which then proceeds to directly call the target callback URL. The resultant POST request on the target system can be exactly the same in either case.

In order to speed integration efforts to platform 500 for external as well as internal participants, a platform SDK can be provided in multiple languages in order to abstract away from the developer some of the core and necessary logic and communications. Some of these core components to include in an SDK can be: lifecycle events, event messaging publications/subscribing, persistent data access, and the like.

Each participant lifecycle can include startup and shutdown events that will signal to others in the platform 500 that a new capability is now available or is now leaving. This registers the participant for uptime monitoring by a monitoring manager.

Flow controls are the mini-workflows captured within a modality. A flow control can be as simple or complex as needed to implement a particular modality. Generally, modalities can then be combined to form scenarios. Flow controls are executed within the scope of a task.

A flow control can be a state machine. Upon transitioning to a state, generally, the task director 520 can publish the new state to the event messaging system 505. Interested platform 500 participants can then accept the state data and eventually publish reports back to the system. Task director 520 can listen for these reports and apply them to the flow control to see if it can transition to a new state. Transitioning is defined in the flow control definition as a collection of measurement criteria sets. Each criteria set is applied to the report measurement groups to see if there is an exact match. The first exact match will initiate the transition rule for that criteria set (which is a new state transition).
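Applying criteria sets to report measurement groups can be sketched as follows. The representation of a criteria set and a measurement group as dictionaries, the function name `select_transition`, and the interpretation of an exact match as "every criterion equals the corresponding measurement in one group" are assumptions for illustration; the description does not define the matching semantics in more detail.

```python
def select_transition(transitions, measurement_groups):
    """Return the target state of the first criteria set that matches.

    transitions: ordered list of (criteria, target_state) pairs, where
        criteria is a dict of measurement name -> required value.
    measurement_groups: list of dicts taken from a report.
    Returns None when no criteria set matches, meaning the task stays in
    its current state and waits for further reports.
    """
    for criteria, target_state in transitions:
        for group in measurement_groups:
            if all(name in group and group[name] == value
                   for name, value in criteria.items()):
                return target_state
    return None


if __name__ == "__main__":
    # Hypothetical flow control fragment and report.
    transitions = [
        ({"answer": "match", "confident": True}, "END_SUCCESS"),
        ({"answer": "non-match", "confident": True}, "END_FAIL"),
    ]
    report_groups = [{"answer": "non-match", "confident": True}]
    print(select_transition(transitions, report_groups))
```

Because criteria sets are evaluated in order, more specific criteria sets would be listed before more general ones in a flow control definition of this shape.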

A fork/join is a special type of flow control construct in which a single input is forked into multiple outputs, and then those multiple outputs are joined into a single flow again. The task director 520 does not proceed to the next step until all reports have been received and accumulated. There are two types of fork/join nodes: fork implicit and fork explicit.

The fork implicit type of fork node definition specifies a single fork transition step, which will be executed concurrently for each subgroup of the state input. A use case of this can be to execute certain logic for each measurement group. The task director 520 creates independent event publications for each sub-grouping and will coordinate the reports into a join continuation. A fork implicit can be defined as a single transition node (the logic node that each subgroup is to execute), with a single measurement criteria called “groupBy”. The value of the “groupBy” measurement will define how the node should create its implicit subgroups. Two types of sub-grouping can be supported, namely, “measurementGroup” and “artifactGroup”.

Once the group of individual tasks that will be forked is determined, the forking logic can also look for a special “artifact_roi” key in the state flow data. If this key is found, then it represents a measurement name that contains an ImageROI definition in each group. Task director 520 can crop this region of interest from the original artifact and use it as the primary artifact of the forked task. If there is no “artifact_roi” key or if the measurement cannot be found on a specific group, then the forking artifact is used for that task. This helper logic can be utilized when some region-of-interest artificial intelligence has run upstream of the forking to determine the regions on the artifact.

The fork explicit type of fork node definition specifies explicit fork transitions for all the input data to be executed concurrently. A use case of this can be to execute multiple independent forms of logic on the same input set (e.g., different AI classifiers). Task director 520 can create independent events and pass the same inputs to each one as defined by the flow control.

Every fork/join node has a corresponding join node. The join node is denoted in the fork definition as the last transition in the list and has an empty list of criteria. When all reports have been accumulated for a fork, then the join node is executed. There are two types of join nodes: join aggregate and join select.

The join aggregate type of join node aggregates all of the forked tasks and adds a special set of report measurements to indicate the number of reason states for all the tasks. So, for example, the final join report contains “joined:END_SUCCESS”, “joined:END_FAIL”, and the like. These can be used to decide next steps of the aggregation in the flow control. Additionally, the flow can provide a “filter” state flow data key, which can be set to either “MATCH” or “NOMATCH” and which will filter all the final results to only those aggregated tasks that have that code. The join select type of join node can select a single report from the forked reports using the specified criteria.
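The join aggregate behavior can be sketched as follows. The representation of each forked task as a dictionary with `reason` and `code` fields and the function name `join_aggregate` are assumptions for illustration; the “joined:” measurement prefix and the “MATCH”/“NOMATCH” filter values come from the description above.

```python
from collections import Counter


def join_aggregate(forked_tasks, result_filter=None):
    """Aggregate completed forked tasks into one join report.

    forked_tasks: list of dicts with keys
        "reason": end reason of the task, e.g. "END_SUCCESS" or "END_FAIL"
        "code":   result code of the task, "MATCH" or "NOMATCH"
    result_filter: optional "MATCH" or "NOMATCH"; when given, only tasks
        with that code are kept in the final results.
    Returns (measurements, results), where measurements maps
    "joined:<reason>" to the number of tasks that ended with that reason.
    """
    counts = Counter(task["reason"] for task in forked_tasks)
    measurements = {f"joined:{reason}": n for reason, n in counts.items()}
    results = [task for task in forked_tasks
               if result_filter is None or task["code"] == result_filter]
    return measurements, results


if __name__ == "__main__":
    tasks = [
        {"reason": "END_SUCCESS", "code": "MATCH"},
        {"reason": "END_SUCCESS", "code": "MATCH"},
        {"reason": "END_FAIL", "code": "NOMATCH"},
    ]
    measurements, results = join_aggregate(tasks, result_filter="MATCH")
    print(measurements, len(results))
```

The resulting counts can then serve as measurements for the criteria sets that decide the next transition after the join.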

Every concurrent execution path for a fork node can execute its own independent flow control. These inner flow controls can begin and end like any flow control with correct end states. When an end state is reached for an inner flow control, that entire path is marked as ready for joining. When all forked inner flows complete, the task director 520 can execute the join node logic and continue the flow.

A flow control can instruct an executing task to spawn another task for continuing the flow. In this case, the task that spawned the new task is completed with an END_SPAWNED reason. A flow can spawn a new task if the artifacts and/or goals of the flow have changed from when the original task started. For example, a flow may begin from an initial task with an image artifact and, through the flow, create new artifacts as a grouping of regions from the original image. The flow can spawn a new task whose artifact is the new artifact group for further processing, rather than the original single image artifact. In this case, spawning preserves the original task for display (with an END_SPAWNED reason) and can continue the workflow on the new task.

More than one flow control can be used. The following are example flow control definitions.

Name / Description

auto-success-flow
   Immediately end in success.
   Final artifact is the exact input artifact.

iq-verify-flow
   Immediately requests agent computation. Answers from agent computation are verified against the modality match answers.
      match with confidence >.8 is ended successful
      non-match with confidence >.8 is ended failed
      any other result with confidence <=.8 is ended no_confidence
   Final artifact (upon success) is the exact input artifact.

iq-multi-verify-flow
   Expects to receive an input task that should be spawned into multiple agent computation tasks based on either a measurementGroup or artifactGroup grouping. Each individual spawned task will execute its own flow to completion. All answers for all tasks are verified against the modality match answers.
      match with confidence >.8 is ended successful
      non-match with confidence >.8 is ended failed
      any other result with confidence <=.8 is ended no_confidence
   Final artifact for each task (upon success) is the artifact created as part of the fork logic.

iq-annotate-flow
   Immediately requests agent computation. Answers are not verified against any match answers.
      any answer with confidence >=.25 is ended successful
      any answer with confidence <.25 is ended no_confidence
   Final artifact (upon success) is the exact input artifact.

ai-verify-flow
   Immediately requests AI. Answers are verified against the modality match answers.
      match with confidence >.8 is ended successful
      non-match with confidence >.8 is ended failed
      any other result with confidence <=.8 is ended no_confidence
   Final artifact (upon success) is the exact input artifact.

ai-detect-and-isolate-flow
   Initially performs AI event detection and gathers regions of interest (ROI) changes. For each ROI, requests AI, whose answers are verified against the modality match answers.
      match with confidence >=.90 is successful and the flow adds this report to the joined node
      non-match with confidence >=.90 is failed and the flow does not add the report to the joined node
      any other result with confidence <.90 is ended no_confidence and the flow does not add the report to the joined node
   If the number of successful reports at the joined node >0, then the success report artifacts (ROIs) are grouped into a group artifact and a final task is spawned with an immediate end success for the entire modality. If the number of successful reports at the joined node == 0, then the modality is finished immediately as failed.
   Final artifact (upon success) is a grouped artifact consisting of the successfully classified ROIs.

ai-iq-detect-and-isolate-flow
   Initially performs AI event detection and gathers regions of interest (ROI) changes. For each ROI, requests AI, whose answers are verified against the modality match answers. If the AI has insufficient confidence in its answer, then the task is sent to agent computation to validate the answer.
      match with confidence >=.95 is successful and the flow adds this report to the joined node
      match with confidence <.95 is sent to the agent computation component for confirmation of the answer
      any non-match result is ended failed and the flow does not add the report to the joined node
   For the NEED_IQ confirmation:
      match with confidence >=.90 is successful and the flow adds this report to the joined node
      non-match with confidence >=.90 is failed and the flow does not add the report to the joined node
      any other result with confidence <.90 is ended no_confidence and the flow does not add the report to the joined node
   If the number of successful reports at the joined node >0, then the success report artifacts (ROIs) are grouped into a group artifact and a final task is spawned with an immediate end success for the entire modality. If the number of successful reports at the joined node == 0, then the modality is finished immediately as failed.
   Final artifact (upon success) is a grouped artifact consisting of the successfully classified ROIs.

ai-detect-verify-flow
   Initially performs AI event detection, and then sends the ROIs to the next step of AI classification using the classifier on the same artifact. Each classified ROI is verified against the modality match answers for the first match.
      match with confidence >=.90 is successful and the modality ends successful
      non-match with confidence >=.90 is failed and the modality ends failed
      any other result with confidence <.90 continues the flow to the next step
   The next step, if reached, is to request the agent computation component for the initial artifact. The answer is verified against the modality match answers.
      match with confidence >=.80 is successful and the modality ends successful
      non-match with confidence >=.80 is failed and the modality ends failed
      any other result with confidence <.80 is failed and the modality ends no_confidence
   Final artifact (upon success) is the exact input artifact.
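As an illustration of how such threshold rules might be evaluated, the following sketch applies the iq-verify-flow rule (strict comparison against .8) and the joined-node rule used by the detect-and-isolate flows. The function names are hypothetical, and the outcome strings other than END_SUCCESS and END_FAIL are assumptions; other flows in the list use different thresholds and inclusive comparisons.

```python
def verify_flow_outcome(is_match, confidence, threshold=0.8):
    """Map one verified answer to an end state (iq-verify-flow style rule)."""
    if confidence > threshold:
        return "END_SUCCESS" if is_match else "END_FAIL"
    # Anything at or below the threshold is not trusted either way.
    return "no_confidence"


def joined_modality_outcome(successful_reports):
    """Finish the modality from the reports collected at the joined node."""
    if len(successful_reports) > 0:
        # Successful ROIs are grouped into a single group artifact.
        return {"outcome": "END_SUCCESS", "group_artifact": list(successful_reports)}
    return {"outcome": "END_FAIL", "group_artifact": []}
```

For example, `verify_flow_outcome(True, 0.85)` ends successful, while `verify_flow_outcome(True, 0.8)` ends with no confidence because the comparison is strict.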

FIG. 7 is a data flow diagram illustrating data flow between components of platform 500 during a process of augmenting artificial intelligence with human intelligence tasks, for example, as described with reference to FIG. 1. At 705, the task director 520 receives sensor data. The task director 520 can receive the sensor data using the event messaging system 505. The task director 520 can determine whether a predefined modality exists for the asset from which the sensor data originated. At 710, the task director 520 can send a request for a predefined modality to the data manager 530. Data manager 530 can retrieve the predefined modality from a database and, at 715, provide the task director 520 with the predefined modality.

At 720, task director 520 can instantiate the solution state machine that is specified by the predefined modality. The solution state machine can have a number of states. Task director 520 can effectuate and direct processing flow as specified by the solution state machine. By way of example, the remainder of the description of FIG. 7 assumes the predefined modality specifies the example solution state machine illustrated in FIG. 2. The solution state machine is in the initial state “S”, so task director 520 transitions the current state of the solution state machine according to the transition rules, which results in the solution state machine having a current state of “MI”. The “MI” state is associated with a machine computation component, which in platform 500 can be machine analytics 525. At 725, task director 520 requests processing of the task by machine analytics 525. Machine analytics 525 can process the task, for example, by performing image processing and classifying the image. At 730, machine analytics 525 can send the result of its processing of the task to task director 520, which can receive the results. The results can include a confidence of the machine analytics 525 result.

At 735, task director 520 can transition the state of the solution state machine. For the example solution state machine illustrated in FIG. 2, the current state of the solution state machine can transition to either the “MI2” or the “HI” state depending on the confidence value returned by machine analytics 525. Assuming the task is one that is challenging for an artificial intelligence algorithm to solve, and the confidence value returned by machine analytics 525 is low (e.g., 0.2), then task director 520 can apply the transition rules (“C<0.3”) and transition the solution state machine to the “HI” state.

At 740, task director 520 can request agent management service 555 to perform processing on the task. Agent management service 555 can receive the prior processing result. Agent management service 555 can query a pool of agents by submitting the sensor data and the agent form contained in the predefined modality to one or more of the agents. Agent management service 555 can receive the completed agent form from the agent (e.g., a client associated with the agent). Agent management service 555 can create a composite agent result where more than one agent is queried and can determine a composite confidence measure. At 745, agent management service 555 can send the query result and confidence measure to task director 520.
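The description does not specify how the composite agent result or its confidence measure is computed. One simple possibility, shown here purely as an assumption, is a majority vote whose confidence is the fraction of queried agents that agree with the winning answer; the function name is hypothetical.

```python
from collections import Counter


def composite_agent_result(answers):
    """Combine answers from several agents into one answer plus a confidence.

    Assumed scheme: majority vote, with confidence equal to the share of
    agents that gave the winning answer.
    """
    if not answers:
        raise ValueError("at least one agent answer is required")
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


answer, confidence = composite_agent_result(
    ["person", "person", "no_person", "person"]
)
```

Under this scheme three of four agreeing agents yield a confidence of 0.75, which a flow control could then compare against its transition rules.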

At 750, task director 520 can advance the current state of the solution state machine. In the case that the confidence measure received from agent management service 555 is not definitive (e.g., 0.5), task director 520 can apply the transition rules (e.g., 0.4<C<0.9) and transition the solution state machine to state “MI2”, which is associated with another machine computation component.

Task director 520 can, at 755, request processing of the task by the machine analytics 525 component. Machine analytics 525 can process the task, for example, by performing image processing and classifying the image. The underlying artificial intelligence system used can be a different system than that used in steps 725 and 730. In some implementations, the underlying artificial intelligence system used can be the same but can use the prior agent management service 555 result and/or the prior machine analytics 525 result. In this manner, machine analytics 525 can either try a new approach (e.g., an ensemble) or refine previous results.

At 760, machine analytics 525 can send the result of its processing of the task to task director 520, which can receive the results. The results can include a confidence of the machine analytics 525 result.

At 765, task director 520 can transition the state of the solution state machine. Assuming the machine analytics 525 result was a high confidence (e.g., 0.95), task director 520 can transition the solution state machine to the terminal state “ES”, which signifies that the task is completed with high confidence and so the task processing has been successful.
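The walk-through of steps 720 through 765 can be sketched as a small confidence-driven state machine. The state names “S”, “MI”, “HI”, “MI2”, and “ES” and the example confidences come from the description; the failure state “EF”, the catch-all rules, the exact thresholds other than C<0.3, and all function names are assumptions, since the full transition rules of FIG. 2 are not reproduced here.

```python
# Each state maps to an ordered list of (rule, next_state); first match wins.
TRANSITIONS = {
    "S": [(lambda c: True, "MI")],
    "MI": [(lambda c: c < 0.3, "HI"), (lambda c: True, "MI2")],
    "HI": [(lambda c: 0.4 < c < 0.9, "MI2"),
           (lambda c: c >= 0.9, "ES"),
           (lambda c: True, "EF")],      # "EF" is a hypothetical failure state
    "MI2": [(lambda c: c >= 0.9, "ES"), (lambda c: True, "EF")],
}


def run_state_machine(components, transitions=TRANSITIONS,
                      start="S", terminal=("ES", "EF")):
    """Advance from the start state until a terminal state is reached.

    `components` maps a state name to a callable that processes the task
    and returns a confidence; the visited states are returned in order.
    """
    state, confidence, path = start, None, [start]
    while state not in terminal:
        state = next(target for rule, target in transitions[state]
                     if rule(confidence))
        path.append(state)
        if state in components:
            confidence = components[state]()
    return path


path = run_state_machine({"MI": lambda: 0.2, "HI": lambda: 0.5, "MI2": lambda: 0.95})
# path == ["S", "MI", "HI", "MI2", "ES"]
```

Keeping the rules in a data table rather than in code mirrors the idea that the predefined modality, not the task director, specifies the solution state machine.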

At 770, task director 520 can provide the outcome of the task processing, which can include whether or not platform 500 was able to come to a high-confidence output and the classification, matching, or determination of the sensor data. (For example, task director 520 can provide whether the processing outcome is accurate and whether platform 500 detected the presence of a person in the sensor data image.)

While the data flow illustrated in FIG. 7 is described as having components of platform 500 send and/or receive data directly from each other, it should be understood that the sending and receiving can be via the event messaging system 505. Further, the event messaging system 505 is not the only protocol that can be implemented with the current subject matter.

Platform 500 can include a core set of common data structures. Any participant may also define, store, and exchange proprietary data that is not defined by the platform, for its own purposes or for those of others.

Platform 500 can provide a means of persisting long-lived data in a way that is efficient, scalable, and extensible to changes and evolution. Persistent data can be both general purpose and proprietary. General purpose data can be that which is meaningful and required by all participants within the platform, such as customer and project specifications. General purpose data formats will be defined by platform 500, and APIs will be exposed for accessing and contributing data. Proprietary purpose data can be that which is specific to a particular participant and/or functionality and which does not require sharing outside of its owner. An example of this can include agent performance/compensation records. Participants with proprietary data requirements can be responsible for establishing their own storage management.

Customer and project data can describe everything about a particular customer and project. A customer can have multiple projects. Each project can contain the information necessary to understand the raw input sources, the problem specification, the points of contact, and the like. Each customer can have a unique identifier, as can each project within that customer. Hence, a customer with identifier C123 could have projects P1 and P2, and the canonical representation C123::P1 will uniquely identify that project for that customer across the global platform 500 namespace.
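The canonical customer and project representation can be built and parsed with a few lines of code. The helper names below are hypothetical; only the “::” separator and the C123::P1 example come from the description.

```python
def canonical_project_id(customer_id, project_id):
    """Build the canonical CUSTOMER::PROJECT identifier."""
    return f"{customer_id}::{project_id}"


def parse_project_id(canonical):
    """Split a canonical identifier back into (customer_id, project_id)."""
    customer_id, separator, project_id = canonical.partition("::")
    if not separator or not customer_id or not project_id:
        raise ValueError(f"not a canonical project identifier: {canonical!r}")
    return customer_id, project_id


project_key = canonical_project_id("C123", "P1")   # "C123::P1"
```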

Both static and dynamic information can be used. Static information can be the data representing the customer account, as well as the problem specifications and details. Static data can be updated over time, but it is not generally going to change in the course of task execution. Dynamic data, on the other hand, can be continuously created and updated over the course of any task execution.

Raw data can include a raw dump (e.g., data lake) representation of all the runtime data generated within the platform. A purpose of this data is to provide a source for auditing flow streams to replay data or examine triggers that resulted in a particular result. Additionally, this data source can serve for deeper analytics to calculate solution precision/recall, as well as serving as an endpoint for third-party integrations with project solutions and results.

Platform 500 can save both structured and unstructured data for future business intelligence. The persistence technology for this data can support both unstructured (or semi-structured) data and fully-structured data sets, which can support arbitrary querying and OLAP queries. Some technologies able to handle this type of data at large scale and minimal costs include HADOOP/SPARK; AWS REDSHIFT (OLAP); and MONGODB (NOSQL). Redshift provides a large amount of out-of-box integration capability with existing business intelligence tools, such as Tableau or Periscope. Hadoop/Spark provides a common architecture and toolset for data scientists to work with; underlying raw, unstructured data can be manipulated and transformed to produce structured insights and results, which can then be visualized by business intelligence visualization tools.

Platform 500 can include scalable streaming middleware, which can handle buffering, batching, and applying the data to the target persistence technology.

Multimedia data can be a source of multimedia resource data (e.g., video, images, sounds, and the like) that is presented to the platform 500 for processing. Resource data can be stored in native format, associated with a globally unique identifier, and can be directly accessible by platform 500 participants. Platform 500 can provide an SDK and/or service API (e.g., multimedia manager) for storing and retrieving raw multimedia data. Internally, these resources can be saved to persistent storage either by remote API calls or directly via the local file system. FIG. 8 is a block diagram illustrating some meta-data of the platform 500.

Although a few variations have been described in detail above, other modifications or additions are possible. For example, platform 500 can be cloud capable, as opposed to cloud based. The purpose of this can be to leverage cloud technology and infrastructure as much as possible and whenever possible. When it is not possible, such as for deployment within a secure facility or in environments without internet accessibility, all major core components of platform 500 can be executable and can operate normally without cloud access. Running platform 500 within a cloud infrastructure can provide benefits, including virtually unlimited storage and compute processing, integration with other public services, centralized monitoring, detached resource dependencies, and more. Running platform 500 within a non-cloud/local environment can require dedicated resources. One or more components of platform 500 can reside internally within a customer facility and reach out to a larger, cloud hosted suite of platform 500 components for processing.

The following describes another example implementation of the current subject matter.

In some implementations, by including the human-in-the-loop, the current subject matter is able to accomplish new kinds of tasks entirely (e.g., those that require human intelligence).

The current subject matter relates to utilizing a “human-in-the-loop” (symbiotic human-machine) approach in order to enable new capabilities of automated or non-automated machine decision systems by, for example, reducing false alarms associated with sensors and analytics as well as expanding the range of use cases to which a machine decision making system or a given sensor and/or analytic may effectively apply. In some implementations, the current subject matter can provide for injection of a human-computation element into a machine decision-making algorithm, allowing for a human to perform (or solve) specific and narrow decisions that the machine decision making system would otherwise be unable to perform (or would perform poorly). The current subject matter can expand the range of use cases to which a machine decision making system or a given sensor and/or analytic may effectively apply. The subject matter can be used with applications that do not currently include machine decision-making algorithms, for example a closed circuit television system that currently does not have a machine decision-making algorithm. The current subject matter can enable new capabilities and improve machine decision making, for example, by improving performance of correct classification, which can provide one or more of reducing false alarms, increasing performance of detection (e.g., hit), increasing performance of correctly determining a miss, and increasing performance of determining a correct rejection.

FIG. 15 is a system block diagram illustrating an example system 1500 that provides for injection of a human-computation element into a machine decision-making algorithm. The system 1500 may include a sensor 1505, analytics 1510, controller 1515, user interface 1520, and human computation element 1525.

The sensor 1505 may include a variety of sensor types: imaging, acoustic, chemical, radiation, thermal, pressure, force, proximity, or a number of other sensor types. “Sensor,” as used herein, may include information that did not originate specifically from physical hardware, such as a computer algorithm.

Analytics 1510 may include a wide range of software analytics and development processes, which are methods and techniques that typically rely on gathering and analyzing information from sensor 1505. Analytics 1510 may include, but are not limited to, face recognition, people counting, object recognition, motion detection, change detection, temperature detection, and proximity sensing. Analytics 1510 may address a user's query of the system 1500 (e.g., a face recognition analytic if the user desires to understand who is entering his or her building). Analytics 1510 also may serve to reduce the amount of sensor information sent to the human computation element 1525, or the amount of bandwidth, memory, computation, and/or storage needed by the system 1500. In some configurations, the system output can be obtained at low latency, in real-time or near (e.g., substantially) real-time.

Controller 1515 may include a tool that utilizes the output and characteristics of the sensor 1505 and/or analytics 1510 in conjunction with internal logic and/or in conjunction with a predictive model of human and machine performance to determine whether and how to utilize human computation element 1525. Controller 1515 may determine that information generated by sensor 1505 and/or analytics 1510 is sufficient to answer a given user query or given task, or controller 1515 may outsource certain tasks to humans (via human computation element 1525) based on system objectives and controller 1515 internal logic and/or a predictive model of human and machine performance. Controller 1515 may coordinate, via human computation element 1525, use of human intelligence to perform tasks that augment, validate, replace, and/or are performed in lieu of sensor 1505 and/or analytics 1510. Controller 1515 may be capable of collecting, interpreting, and/or integrating the results of human work into the machine decision making process and system. Controller 1515 may be capable of converting a user-defined task that is defined either via natural language or via a more structured query into a smaller task or series of smaller tasks, as it deems necessary, and into an output for an end user, using sensor 1505 and/or analytics 1510, human computation element 1525, or both.

In addition, controller 1515 may maintain statistics pertaining to the performance of sensor 1505 and/or analytics 1510 as well as human computation element 1525 and/or individual human workers or subpopulations of workers. These statistics may be used to improve the means of utilizing machine and human elements of the pipeline. System 1500 may be capable of gathering data that may be useful for improving the performance characteristics of system 1500, sensor 1505 and/or analytics 1510, or the human computation element 1525. Typically these data are selected because they are examples for which the sensor 1505 and/or analytics 1510 have low relative certainty or they are examples that are informative for improving characteristics of sensor 1505 and/or analytics 1510.

Human computation element 1525 utilizes human intelligence. A purpose of human computation element 1525 may be to aid system 1500 in its ability to address AI-hard or AI-complete problems that are difficult or impossible to solve reliably and/or cost effectively with sensor 1505 and/or analytics 1510 (e.g., software analytic technology) alone. Another purpose of incorporating human intelligence may be to perform tasks that augment or validate the sensor 1505 and/or analytics 1510 of system 1500. One example of this is using humans to validate the output of a computer vision analytic via a micro task involving imagery. Human computation element 1525 may also aid in the translation of tasks received from users. Task translation may range from none (e.g., if the task is given directly to humans) to minimal (e.g., if the task is given partly to computers and partly to humans, would benefit from formalization, or is decomposed and then executed by either computers or humans) to substantial (e.g., if the system determines it may be able to improve its effectiveness by translating the task substantially). The system may distribute a task in order to manage and improve characteristics such as throughput, latency, accuracy, and cost. Humans may also contribute innovative solutions into the system 1500, make incremental changes to existing solutions, or perform intelligent recombination. Human computation element 1525 may function as part of an ongoing process, which may be aimed at real-time or near-real-time applications as well as at applications that require results at lower frequencies. System 1500 may utilize a task market such as AMAZON® Mechanical Turk, but is built in such a way that it may also incorporate many different kinds of human workers worldwide via other crowd work platforms or via a custom system interface. Examples of other crowd workers may include employees of an enterprise, off-duty or retired law enforcement professionals, subject matter experts, or on-duty personnel. The system may include a process for establishing and verifying credentials of the crowd workers for the purpose of meeting system objectives or improving system efficiency. Incentives to participation may include monetary compensation, volunteerism, curiosity, increasing reputation/recognition, desire to participate in a game-like experience, other motivation sources, and the like.

The end user interface 1520 may include an interface that combines alerts with a human-like means of interaction.

System 1500 is a closed loop system that can use sensor 1505 and/or analytics 1510 performance characteristics as well as human inputs (from the human computation element 1525 or from an end user) to improve its underlying performance characteristics relative to the challenges (e.g., AI-hard or AI-complete problems) the system 1500 confronts. The system 1500 may incorporate a scheme for collecting useful “ground truth” examples that correspond to these challenges. Data collected by system 1500 may be used to improve system characteristics using machine learning or other statistical methods.

FIG. 16A is a process flow diagram illustrating a method 1600 of injecting human-computation into a machine decision-making algorithm, allowing for a human to perform (or solve) specific and narrow decisions that the machine decision making system would otherwise be unable to perform (or would perform poorly). The particular example application of FIG. 16A is to detect graffiti using a vapor sensor and an imaging sensor. At 1605, a user may define a question to be answered by the system. For example, a user may define a question regarding whether graffiti is occurring (or has occurred) and who may be involved in the graffiti.

At 1610, the question may be translated into something that can be addressed programmatically with hardware, software, and humans. For example, the pseudocode at Table 1 may be used, which enables the human in the loop to work alongside one or more sensors to aid in solving more complex tasks. In the example of Table 1, if the vapor sensor is confident that there is nothing there, then the process ends (no need to involve a human). If it is confident there is a high vapor condition, a report is sent (again, no need to involve a human). If there is medium confidence, the human in the loop is asked to weigh in on the situation and inform the answer.

TABLE 1

if (vapor_sensor.ppm > 250)
    if (camera.person_holding_can)
        sendreport()

if (vapor_sensor.ppm > 750)
    if (camera.person_holding_can)
        sendreport()

If (vapor_sensor < 50ppm)
    End
If (vapor_sensor > 750ppm)
    sendreport()
If (vapor_sensor is between 50ppm and 750ppm)
    Get_human_answer_on_whether_graffiti_occurring()
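The human-in-the-loop branch of Table 1 can be written as a short runnable sketch. The 50 ppm and 750 ppm thresholds come from Table 1; the function name, the returned action strings, and the choice to send a report when the human confirms graffiti are assumptions made for illustration.

```python
def assess_graffiti(vapor_ppm, ask_human):
    """Decide what to do with one vapor reading.

    `ask_human` is a hypothetical callable that shows the reading to a
    human in the loop and returns True if graffiti appears to be occurring.
    """
    if vapor_ppm < 50:
        return "end"             # confident nothing is there; no human needed
    if vapor_ppm > 750:
        return "send_report"     # confident high vapor condition; no human needed
    # Medium confidence: defer to the human in the loop.
    return "send_report" if ask_human(vapor_ppm) else "end"


actions = [assess_graffiti(ppm, ask_human=lambda ppm: ppm > 400)
           for ppm in (20, 300, 500, 900)]
# actions == ["end", "end", "send_report", "send_report"]
```

Only the two middle readings reach the human, which is the point of the design: human effort is spent where the sensor alone is least reliable.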

At 1615, a sensor (e.g., sensor 1505) assesses the situation (takes a measurement) and makes a decision (or guess), for example, low, medium, or high levels of vapor. For example, for vapor_sensor.ppm, the sensor is making its best guess as to whether a condition (detecting a vapor level associated with spray paint) exists. If the decision is negative (no vapors), then no graffiti is occurring and the assessment may terminate. If the decision is a medium level of vapors, there may or may not be graffiti, and human computation element 1525 may be employed, at 1620, to inject a human decision or review of the sensor assessment. The human may review the sensor data and render a decision regarding whether graffiti is occurring. The high, medium, or low assessment by the sensor may be a function of the receiver operating characteristics (ROC) of the sensor and may vary.

If the human decision indicates that graffiti is occurring, or the vapor sensor indicates with high reliability that vapor is present and so graffiti is occurring (so that no human input is required), at 1625 a second sensor, such as an imaging sensor, can assess the situation (e.g., take a measurement). FIG. 16B illustrates an example image containing graffiti. The imaging sensor may also render a decision with low, medium, and/or high likelihood that the imaging sensor has identified who is creating the graffiti. As with the vapor sensor, if the imaging sensor is confident in its determination, the system may proceed directly to 1630, where a report can be issued or no action taken. However, if the imaging sensor renders a decision with low confidence, at 1625, human-computation element 1525 may be used to allow a human to make the determination. The human may also weigh in using data from the vapor sensor and imaging sensor, if the vapor sensor could not benefit from human insight by itself or if it is costless to engage the imaging sensor.

Thus, the example method 1600 allows for adaptive behavior based on the confidence of decisions (or assessments) made by sensors. For example, if the vapor sensor and the imaging sensor both render confident results, the process may involve machine-only decision making; if either the vapor sensor or the imaging sensor renders a not confident result (e.g., increased likelihood of an incorrect decision), then a human computation element may be injected into the machine decision loop to render a decision. The method may close the loop and allow human-generated ground truth to improve the algorithms used to process sensor data, the confidence threshold for each sensor, the weight of each sensor's information in the overall solution, and more.

FIG. 17 is a system block diagram illustrating an example implementation of the current subject matter for a video/face recognition system 1700. The face recognition system 1700 may be able to determine whether people on a “watch list” (or people who belong to any notable subpopulation such as very important persons (VIPs), frequent shoppers, security threats, and the like) are entering a given facility. The face recognition system 1700 includes a video sensor 1705, an image analysis analytics 1710, a controller 1715, a user interface 1720, a human-computation element 1725, and a learner 1730.

The video sensor 1705 can acquire images (for example, of a person), which are analyzed by the image analysis analytics 1710, which can generate a determination whether a person in the image is on the watch list. The controller 1715 can receive the decision and, based on a measure of confidence of the decision, determine whether to employ the human computation element 1725 to verify the decision. If the human computation element 1725 is employed, using, for example, Mechanical Turk or a similar service, a human will review the image with possible candidates from the watch list to determine if there is a match. When the face recognition analytics 1710 is incorrect (as identified by the human-computation element 1725), the human analysis of the mistake 1735 may be input to a learner 1730, which can use the data point to further train the face recognition analytics 1710 and improve performance. Thus, the human computation element provides feedback and aids in improving the performance of the machine element over time.
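The controller and learner interaction described above can be sketched as follows. The function name, the 0.9 confidence threshold, and the list used to hold training examples are assumptions for illustration; the description does not specify how the controller decides or how the learner stores its data.

```python
def review_decision(image_id, machine_match, confidence, ask_human,
                    training_examples, threshold=0.9):
    """Return the final watch-list decision for one image.

    `ask_human` is a hypothetical callable standing in for the human
    computation element; `training_examples` collects corrections that a
    learner could later use to retrain the analytics.
    """
    if confidence >= threshold:
        return machine_match                  # machine-only decision
    human_match = ask_human(image_id)         # low confidence: verify with a human
    if human_match != machine_match:
        # The analytics was wrong; keep the corrected label for the learner.
        training_examples.append((image_id, human_match))
    return human_match


corrections = []
final = review_decision("img-001", machine_match=True, confidence=0.55,
                        ask_human=lambda image_id: False,
                        training_examples=corrections)
# final is False and corrections == [("img-001", False)]
```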

FIGS. 18 and 19 are process flow diagrams illustrating using the current subject matter for face recognition and using the face recognition system 1700.

The system described in FIG. 17 may not be limited to face recognition. For example, the system may be used to handle a wide variety of tasks, such as counting sports utility vehicles (SUVs) in a parking lot or validating computer vision analytic performance as shown in FIGS. 20 and 21.

In some implementations, the current subject matter may incorporate an automatic enrollment process whereby a user may contribute examples of data that are either positive examples, negative examples, or examples that are necessary for effective system operation. The current subject matter may efficiently solicit, gather, and catalogue these data. For instance, in the case of face recognition, users may contribute images of people whom they desire to identify, and the current subject matter may gather and catalogue these face images. These images may be used to train analytics and/or humans as well as to guide system outputs according to the system's internal logic and the need expressed by the user.

Configuration examples may include:

-   Systems addressing physical security, safety, or asset protection needs.
-   Systems addressing the improvement and/or monitoring of retail environments.
-   Systems addressing real-time sensor feeds.
-   Systems addressing historic sensor feeds.
-   Systems incorporating multiple sensors.
-   Systems addressing residential, education, medical, financial, entertainment, industrial, transportation, commercial, law enforcement, military, or governmental applications.

FIG. 22 is a block diagram illustrating an example of hardware 2200 used by the current subject matter, which may include one or more sensors coupled with a CPU and/or GPU. The device may perform a portion of its processing locally (onboard device) and a portion of its processing remotely (e.g., using cloud-based computation). This computational scheme may be in place in order to efficiently utilize bandwidth, storage, and device memory, while facilitating the efficient implementation of the aforementioned human-in-the-loop process. The hardware may be designed in such a way that additional sensors are readily supported via a bus-modular system approach. In addition, the hardware incorporates a means to communicate through a network, such as a WiFi or cellular network.

Although a few variations have been described in detail above, other modifications or additions are possible. For example, the current subject matter is not limited to the security domain, but can extend to voice recognition and other domains including physical security, safety, or asset protection; the improvement and/or monitoring of retail environments; real-time sensor feeds; historic sensor feeds; multiple sensors; residential, education, medical, financial, entertainment, industrial, transportation, commercial, law enforcement, military, and/or governmental applications.

One or more aspects or features of the subject matter described herein can be realized in digital electronic circuitry, integrated circuitry, specially designed application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), computer hardware, firmware, software, and/or combinations thereof. These various aspects or features can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which can be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. The programmable system or computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

These computer programs, which can also be referred to as programs, software, software applications, applications, components, or code, include machine instructions for a programmable processor, and can be implemented in a high-level procedural language, an object-oriented programming language, a functional programming language, a logical programming language, and/or in assembly/machine language. As used herein, the term “machine-readable medium” refers to any computer program product, apparatus and/or device, such as for example magnetic discs, optical disks, memory, and Programmable Logic Devices (PLDs), used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term “machine-readable signal” refers to any signal used to provide machine instructions and/or data to a programmable processor. The machine-readable medium can store such machine instructions non-transitorily, such as for example as would a non-transient solid-state memory or a magnetic hard drive or any equivalent storage medium. The machine-readable medium can alternatively or additionally store such machine instructions in a transient manner, such as for example as would a processor cache or other random access memory associated with one or more physical processor cores.

To provide for interaction with a user, one or more aspects or features of the subject matter described herein can be implemented on a computer having a display device, such as for example a cathode ray tube (CRT) or a liquid crystal display (LCD) or a light emitting diode (LED) monitor for displaying information to the user and a keyboard and a pointing device, such as for example a mouse or a trackball, by which the user may provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well. For example, feedback provided to the user can be any form of sensory feedback, such as for example visual feedback, auditory feedback, or tactile feedback; and input from the user may be received in any form, including, but not limited to, acoustic, speech, or tactile input. Other possible input devices include, but are not limited to, touch screens or other touch-sensitive devices such as single or multi-point resistive or capacitive trackpads, voice recognition hardware and software, optical scanners, optical pointers, digital image capture devices and associated interpretation software, and the like.

In the descriptions above and in the claims, phrases such as “at least one of” or “one or more of” may occur followed by a conjunctive list of elements or features. The term “and/or” may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually or any of the recited elements or features in combination with any of the other recited elements or features. For example, the phrases “at least one of A and B;” “one or more of A and B;” and “A and/or B” are each intended to mean “A alone, B alone, or A and B together.” A similar interpretation is also intended for lists including three or more items. For example, the phrases “at least one of A, B, and C;” “one or more of A, B, and C;” and “A, B, and/or C” are each intended to mean “A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together.” In addition, use of the term “based on,” above and in the claims is intended to mean, “based at least in part on,” such that an unrecited feature or element is also permissible.

The subject matter described herein can be embodied in systems, apparatus, methods, and/or articles depending on the desired configuration. The implementations set forth in the foregoing description do not represent all implementations consistent with the subject matter described herein. Instead, they are merely some examples consistent with aspects related to the described subject matter. Although a few variations have been described in detail above, other modifications or additions are possible. In particular, further features and/or variations can be provided in addition to those set forth herein. For example, the implementations described above can be directed to various combinations and subcombinations of the disclosed features and/or combinations and subcombinations of several further features disclosed above. In addition, the logic flows depicted in the accompanying figures and/or described herein do not necessarily require the particular order shown, or sequential order, to achieve desirable results. Other implementations may be within the scope of the following claims.

What is claimed is:
 1. A method comprising: receiving sensor data; classifying the sensor data into one of two or more classes by at least requesting processing of a machine computational component, receiving a result of the machine computation component, requesting processing of an agent computation component, and receiving a result of the agent computation component, the agent computation component including a platform to query an agent; and providing the result from the agent computation component or the result from the machine computation component.
 2. The method of claim 1, wherein processing of the agent computation component is requested when a confidence of the machine computation component result is below a first threshold.
 3. The method of claim 2, wherein processing of the agent computational component is requested when the confidence of the machine computation component result is above a second threshold.
 4. The method of claim 1, wherein the providing includes requesting further processing of the agent computation component result by the machine computation component.
 5. The method of claim 1, wherein the providing includes requesting further processing of the machine computation component result by the agent computation component.
 6. The method of claim 1, wherein the machine computation component includes a deep learning artificial intelligence classifier.
 7. The method of claim 6, wherein the machine computation component detects objects and classifies objects in the sensor data, the sensor data including an image.
 8. The method of claim 1, further comprising: determining a composite result from the machine computation component result and the agent computation component result and using a measure of result confidence.
 9. The method of claim 1, wherein at least one of the receiving, classifying, and providing is performed by at least one data processor forming part of at least one computing system.
 10. A method comprising: receiving sensor data of a security system asset; accessing a predefined modality associated with the security system asset, the modality defining a computational task for analyzing the received sensor data; instantiating a solution state machine object having a plurality of states and rules for transitioning between the plurality of states, the plurality of states including an initial state, a first intermediate state, a second intermediate state, and a terminal state; executing the task using the solution state machine object, the executing including: requesting processing of the task by, and receiving a result of, a machine computation component when a current state of the solution state machine object is the first intermediate state, the result received from the machine computation component including a first confidence measure; requesting processing of the task by, and receiving a result of, an agent computation component when the current state of the solution state machine object is the second intermediate state, the result received from the agent computation component including a second confidence measure; and transitioning the current state of the solution state machine object according to the transition rules and at least one of: the first confidence measure and the second confidence measure; and providing a characterization of the terminal state when the current state of the solution state machine object is the terminal state.
 11. The method of claim 10, wherein the machine computation component executes a machine learning algorithm to perform the task.
 12. The method of claim 11, wherein the machine computation component includes a convolutional neural network.
 13. The method of claim 10, wherein the agent computation component includes a platform that queries at least one agent, receives a query result, determines a confidence measure of the agent, and determines the second confidence measure using the confidence measure of the queried agent.
 14. The method of claim 10, wherein the sensor data includes an image including a single image, a series of images, or a video; and the computational task includes: detecting a pattern in the image; detecting a presence of an object within the image; detecting a presence of a person within the image; detecting intrusion of the object or person within a region of the image; detecting suspicious behavior of the person within the image; detecting an activity of the person within the image; detecting an object carried by the person; detecting a trajectory of the object or the person in the image; a status of the object or person in the image; identifying whether a person who is detected is on a watch list; determining whether a person or object has loitered for a certain amount of time; detecting interaction among persons or objects; tracking a person or object; determining status of a scene or environment; determining the sentiment of one or more people; counting the number of objects or people; determining whether a person appears to be lost; determining whether an event is normal or abnormal; and/or determining whether text matches that in a database.
 15. The method of claim 10, wherein the security system asset is an imaging device, a video camera, a still camera, a radar imaging device, a microphone, a chemical sensor, an acoustic sensor, a radiation sensor, a thermal sensor, a pressure sensor, a force sensor, or a proximity sensor.
 16. The method of claim 10, wherein the modality defines: solution state machine object attributes, acceptable confidence for reaching the terminal state, a set of assets that trigger the modality, and agent query structure.
 17. The method of claim 10, wherein executing the task includes posting, via a messaging queuing protocol, requested processing tasks, and wherein the machine computation component and agent computation component are microservices operating on tasks posted via the messaging queue protocol.
 18. The method of claim 10, further comprising: modifying a predictive model of the machine computation component using the result received from the agent computation component as a supervisory signal and the received sensor data as input.
 19. The method of claim 10, wherein at least one of the receiving, accessing, instantiating, executing, and providing is performed by at least one data processor forming part of at least one computing system.
 20. A non-transitory computer program product which, when executed by at least one data processor forming part of at least one computer, result in operations comprising: receiving sensor data; classifying the sensor data into one of two or more classes by at least requesting processing of a machine computational component, receiving a result of the machine computation component, requesting processing of an agent computation component, and receiving a result of the agent computation component, the agent computation component including a platform to query an agent; and providing the result from the agent computation component or the result from the machine computation component.
 21. The computer program product of claim 20, wherein processing of the agent computation component is requested when a confidence of the machine computation component result is below a first threshold.
 22. The computer program product of claim 21, wherein processing of the agent computational component is requested when the confidence of the machine computation component result is above a second threshold.
 23. The computer program product of claim 20, wherein the providing includes requesting further processing of the agent computation component result by the machine computation component.
 24. The computer program product of claim 20, wherein the providing includes requesting further processing of the machine computation component result by the agent computation component.
 25. A system comprising: a media processor that receives sensor data; and means for classifying the sensor data into one of two or more classes by at least requesting processing of a machine computational component, receiving a result of the machine computation component, requesting processing of an agent computation component, and receiving a result of the agent computation component, the agent computation component including a platform to query an agent.
 26. The system of claim 25, further comprising: means for requesting further processing of the agent computation component result by the machine computation component; and means for requesting further processing of the machine computation component result by the agent computation component.